feat: add phased decision report to orchestrator by WellDunDun · Pull Request #45 · selftune-dev/selftune

WellDunDun · 2026-03-14T13:45:27Z

Replace the bare-bones numeric summary with a structured decision report that explains each orchestration phase (sync, status, decisions, evolve, watch). Operators can now see exactly which skills were considered, which were skipped, and why — along with evolution results and watch alerts.

Changes

formatOrchestrateReport(): New pure function builds a 5-phase decision report from OrchestrateResult
JSON output: Now includes decisions array with per-skill action, reason, and outcome (deployed/validation/alert details)
Human output: Replaces bare summary with full report showing sync sources, status breakdown, per-skill decisions, evolution results, and watch phase
Test coverage: 11 new tests covering all report phases

Safe defaults remain intact; no changes to core telemetry semantics.
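
As a rough illustration of the "pure function builds a phased report" idea in the changes above, here is a minimal sketch. It is not the PR's code: the type names, field names (`dryRun`, `candidates`), and helper names are hypothetical, and only two phases are modelled. It assumes each phase helper returns its own lines (or an empty array when there is nothing to report) and that the top-level function just concatenates them.

```typescript
// Hypothetical, simplified shapes; the real OrchestrateResult in the PR is richer.
type Action = "evolve" | "watch" | "skip";

interface SkillActionSketch {
  skill: string;
  action: Action;
  reason: string;
}

interface OrchestrateResultSketch {
  dryRun: boolean;
  candidates: SkillActionSketch[];
}

// Each phase helper returns its own lines, or [] when it has nothing to report.
function formatDecisionPhaseSketch(candidates: SkillActionSketch[]): string[] {
  if (candidates.length === 0) return [];
  const lines: string[] = ["Decisions"];
  for (const c of candidates) {
    lines.push(`  ${c.skill.padEnd(20)} ${c.action.toUpperCase().padEnd(8)} ${c.reason}`);
  }
  return lines;
}

// Summary counts per action plus the run mode.
function formatSummarySketch(result: OrchestrateResultSketch): string[] {
  const count = (a: Action): number =>
    result.candidates.filter((c) => c.action === a).length;
  return [
    `Summary: evolve=${count("evolve")} watch=${count("watch")} skip=${count("skip")}`,
    result.dryRun ? "Mode: dry-run" : "Mode: auto-approve",
  ];
}

// Pure: performs no I/O, so the same input always yields the same string.
function formatReportSketch(result: OrchestrateResultSketch): string {
  return [
    ...formatDecisionPhaseSketch(result.candidates),
    ...formatSummarySketch(result),
  ].join("\n");
}
```

Keeping the formatter free of I/O is what makes it straightforward to unit test: a test can build a result object by hand and assert on the returned string.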

🤖 Generated with Claude Code

Replace the bare-bones numeric summary with a structured decision report that explains each orchestration phase (sync, status, decisions, evolve, watch) so operators can understand what the loop did and why. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai · 2026-03-14T13:45:47Z

📝 Walkthrough

Summary by CodeRabbit

Release Notes

New Features
- Enhanced CLI orchestration reporting with dual output: structured JSON containing per-skill decisions and results on stdout, and a formatted human-readable decision report on stderr.
Tests
- Added extensive test coverage for report generation, validating formatting across all phases and scenarios.

Walkthrough

This PR adds human-readable report formatting to the orchestration CLI. It introduces helper functions to render phase-specific decision reports and exports a new formatOrchestrateReport function that converts orchestration results to formatted text. CLI output is restructured to emit JSON to stdout and formatted reports to stderr, with comprehensive test coverage validating formatting across various orchestration states.
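
The stdout/stderr split described in the walkthrough could look roughly like the sketch below. The `DecisionSketch` shape, the `Sink` type, and `emitOutputs` are hypothetical names used only for illustration; the sinks are passed in as parameters so the behaviour can be checked without touching real streams.

```typescript
// Hypothetical shapes for illustration only.
interface DecisionSketch {
  skill: string;
  action: string;
  reason: string;
}

type Sink = (text: string) => void;

// Writes the machine-readable payload to one sink and the human report to another.
function emitOutputs(
  decisions: DecisionSketch[],
  report: string,
  out: Sink,
  err: Sink,
): void {
  out(JSON.stringify({ decisions }, null, 2));
  err(report);
}
```

In a CLI entry point, passing `console.log` as `out` and `console.error` as `err` would send the JSON to stdout and the report to stderr, so a consumer that pipes stdout into a JSON parser never sees the human-readable text.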

Changes

| Cohort / File(s) | Summary |
|---|---|
| CLI Reporting Functions: `cli/selftune/orchestrate.ts` | Added `formatOrchestrateReport` and phase-specific helper functions (`formatSyncPhase`, `formatStatusPhase`, `formatDecisionPhase`, `formatEvolutionPhase`, `formatWatchPhase`). Restructured CLI output to emit a JSON object with a decisions array to stdout and a human-readable formatted report to stderr. Replaced inline summary printing. |
| Test Coverage: `tests/orchestrate.test.ts` | Added comprehensive tests for `formatOrchestrateReport` covering dry-run/auto-approve modes, sync source availability, repair info, status breakdown, per-skill decisions, evolution/watch results, and summary counts. Introduced a `makeOrchestrateResult` helper function for test construction. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

selftune#40: Directly modifies the same cli/selftune/orchestrate.ts CLI output path and exports, introducing the JSON stdout + stderr report structure that this PR tests and refines.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 37.50%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | Title follows conventional commits format with 'feat:' prefix and accurately describes the main change of adding a phased decision report to the orchestrator. |
| Description check | ✅ Passed | Description is directly related to the changeset, detailing the new formatOrchestrateReport function, JSON output enhancements, human-readable output changes, and test coverage additions. |

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cli/selftune/orchestrate.ts`:
- Around line 165-179: In formatWatchPhase, avoid appending empty parentheses
when c.watchResult?.snapshot is missing by building a stats string only when
snapshot exists: compute passInfo and baseInfo from snap, join them into a
non-empty statsPart, and then only append ` (${statsPart})` to the line when
statsPart is truthy; leave alertTag and reason formatting unchanged. Ensure you
reference formatWatchPhase, the watched variable, c.watchResult?.snapshot
(snap), passInfo/baseInfo, and alertTag when making the change.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 37e0b1ca-27e1-400d-921e-5fd71478d03b

📥 Commits

Reviewing files that changed from the base of the PR and between aec5f4f and 9a0e025.

📒 Files selected for processing (2)

cli/selftune/orchestrate.ts
tests/orchestrate.test.ts

coderabbitai · 2026-03-14T13:48:48Z

cli/selftune/orchestrate.ts

+function formatWatchPhase(candidates: SkillAction[]): string[] {
+  const watched = candidates.filter((c) => c.action === "watch");
+  if (watched.length === 0) return [];
+
+  const lines: string[] = ["Phase 5: Watch"];
+  for (const c of watched) {
+    const snap = c.watchResult?.snapshot;
+    const passInfo = snap ? `pass_rate=${snap.pass_rate.toFixed(2)}` : "";
+    const baseInfo = snap ? `, baseline=${snap.baseline_pass_rate.toFixed(2)}` : "";
+    const alertTag = c.watchResult?.alert ? " [ALERT]" : "";
+    lines.push(`  ${c.skill.padEnd(20)} ${c.reason}${alertTag} (${passInfo}${baseInfo})`);
+  }
+
+  return lines;
+}


⚠️ Potential issue | 🟡 Minor

Empty parentheses appended when snapshot is missing.

If c.watchResult?.snapshot is undefined (snap falsy), passInfo and baseInfo are both empty strings, resulting in output like "SkillName reason ()". Consider omitting the parentheses entirely when there's no snapshot data.

Proposed fix

     const snap = c.watchResult?.snapshot;
     const passInfo = snap ? `pass_rate=${snap.pass_rate.toFixed(2)}` : "";
     const baseInfo = snap ? `, baseline=${snap.baseline_pass_rate.toFixed(2)}` : "";
     const alertTag = c.watchResult?.alert ? " [ALERT]" : "";
-    lines.push(`  ${c.skill.padEnd(20)} ${c.reason}${alertTag} (${passInfo}${baseInfo})`);
+    const metrics = snap ? ` (${passInfo}${baseInfo})` : "";
+    lines.push(`  ${c.skill.padEnd(20)} ${c.reason}${alertTag}${metrics}`);
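
For readers who want to try the behaviour in isolation, the following is a self-contained sketch of the same idea: build the metrics suffix only when a snapshot exists. The interfaces are hypothetical minimal stand-ins (the real `SkillAction` type is not shown in this review), and `alert` is modelled as a boolean purely for illustration.

```typescript
// Hypothetical minimal shapes; only the fields the formatter reads are modelled.
interface WatchSnapshotSketch {
  pass_rate: number;
  baseline_pass_rate: number;
}

interface WatchCandidateSketch {
  skill: string;
  reason: string;
  watchResult?: { alert?: boolean; snapshot?: WatchSnapshotSketch };
}

function formatWatchLine(c: WatchCandidateSketch): string {
  const snap = c.watchResult?.snapshot;
  const alertTag = c.watchResult?.alert ? " [ALERT]" : "";
  // The parenthesised suffix exists only when there is snapshot data to show.
  const metrics = snap
    ? ` (pass_rate=${snap.pass_rate.toFixed(2)}, baseline=${snap.baseline_pass_rate.toFixed(2)})`
    : "";
  return `  ${c.skill.padEnd(20)} ${c.reason}${alertTag}${metrics}`;
}
```

With no snapshot the line ends right after the reason (and the alert tag, if any), so the empty `()` described in the review comment cannot appear.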

Orchestrate output now explains each decision clearly so users can trust the autonomous loop. Adds formatOrchestrateReport() with 5-phase human report (sync, status, decisions, evolution, watch) and enriched JSON with per-skill decisions array. Supersedes PR #45. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

WellDunDun · 2026-03-15T10:32:33Z

superseded

* feat: add phased decision report to orchestrator

  Orchestrate output now explains each decision clearly so users can trust the autonomous loop. Adds formatOrchestrateReport() with 5-phase human report (sync, status, decisions, evolution, watch) and enriched JSON with per-skill decisions array. Supersedes PR #45.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: update orchestrate workflow docs and changelog for decision report

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: remove redundant null check in formatEvolutionPhase

  The filter already guarantees evolveResult is defined; use non-null assertion instead of a runtime guard.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add defensive optional chaining for watch snapshot in JSON output

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: avoid empty parentheses in watch report when snapshot missing

  Consolidates pass_rate and baseline into a single conditional metrics suffix so lines without a snapshot render cleanly. Addresses CodeRabbit review feedback.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resolve biome lint and format errors

  Replace non-null assertion (!) with type-safe cast to satisfy noNonNullAssertion rule, and collapse single-arg lines.push to one line per biome formatter.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
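
The commits above settle on a cast to avoid the non-null assertion after filtering. A different option, shown here only as a sketch and not as what the commits did, is a user-defined type predicate, which lets the compiler carry the "evolveResult is defined" fact through the filter without a cast or `!`. All type and function names below are hypothetical.

```typescript
// Hypothetical shapes; the real evolve result type is not shown in this thread.
interface EvolveResultSketch {
  deployed: boolean;
}

interface CandidateSketch {
  skill: string;
  evolveResult?: EvolveResultSketch;
}

type WithEvolveResult = CandidateSketch & { evolveResult: EvolveResultSketch };

// Type predicate: everything that passes the filter has a defined evolveResult.
function hasEvolveResult(c: CandidateSketch): c is WithEvolveResult {
  return c.evolveResult !== undefined;
}

function deployedSkills(candidates: CandidateSketch[]): string[] {
  return candidates
    .filter(hasEvolveResult)
    // No `!` or cast needed here: the element type is already narrowed.
    .filter((c) => c.evolveResult.deployed)
    .map((c) => c.skill);
}
```

The trade-off is one extra named function in exchange for keeping the narrowing checked by the compiler rather than asserted by the author.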

@WellDunDun

* Add make clean-branches target for repo hygiene Deletes Conductor worktree branches (custom/prefix/router-*), selftune evolve test branches, and orphaned worktree-agent-* branches. Also prunes stale remote tracking refs. Run with `make clean-branches`. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add composability v2: synergy detection, sequence extraction, workflow candidates Extends the composability analysis with positive interaction detection (synergy scores), ordered skill sequence extraction from usage timestamps, and automatic workflow candidate flagging. Backwards compatible — v1 function and tests unchanged, CLI falls back to v1 when no usage log exists. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add workflow discovery, SKILL.md writer, and CLI command (v0.3) Implements multi-skill workflow support: discovers workflow patterns from existing telemetry, displays them via `selftune workflows`, and codifies them to SKILL.md via `selftune workflows save`. Includes 48 tests. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address PR review feedback for workflows * fix: stabilize biome config for CI lint * fix: address new PR review threads * Fix all 13 demo findings: grade/evolve/status/doctor/telemetry BUG-1: Remove false-positive git hook checks, fix hook key names to PascalCase BUG-2: Auto-derive expectations from SKILL.md when none provided BUG-3: Add --help output to grade command documenting --session-id BUG-4: Prefer skills_invoked over skills_triggered in session matching BUG-5: Add pre-flight validation and human-readable errors to evolve BUG-6: Distinguish real Skill tool calls from SKILL.md browsing reads IMP-1: Confirmed templates/ in package.json files array IMP-2: Auto-install agent files during init IMP-3: Show UNGRADED instead of CRITICAL when no graded sessions exist IMP-4: Use portable npx selftune hook <name> instead of absolute paths IMP-5: Add selftune auto-grade command IMP-6: Mandate AskUserQuestion in evolve workflows IMP-7: Add selftune quickstart command Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix hook subcommand to spawn hooks as subprocesses Hook files guard execution behind import.meta.main, so dynamically importing them was a no-op. Spawn as subprocess instead so stdin payloads are processed and hooks write telemetry logs correctly. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Address review comments * Fix lint CI failures * Trigger CI rerun * Fix lint command resolution * Address remaining review comments * Fix grade-session test isolation * docs: Align selftune skill docs with shipped workflows (#31) * Update selftune workflow docs and skill versioning * Improve selftune skill portability and setup docs * Clarify workflow doc edge cases * Fix OpenClaw doctor validation and workflow docs * Polish composability and setup docs * Fix BUG-7, BUG-8, BUG-9 from demo findings (#32) * Fix BUG-7, BUG-8, BUG-9 from demo findings BUG-7: Add try/catch + array validation around eval-set file loading in evolve() so parse errors surface as user-facing messages instead of silent exit. BUG-8: Add cold-start bootstrap — when extractFailurePatterns returns empty but the eval set has positive entries, treat those positives as missed queries so evolve can work on skills with zero usage history. BUG-9: Add --out flag to evals CLI parseArgs as alias for --output. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix evolve CI regressions * Isolate blog proof fixture mutations --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * Fix dashboard data and extract telemetry contract (#33) * Fix dashboard export and layout * Improve telemetry normalization groundwork * Add test runner state * Separate and extract telemetry contract * Fix telemetry CI lint issues * Fix remaining CI regressions * Detect zero-trigger monitoring regressions * Stabilize dashboard report route tests * Address telemetry review feedback * Fix telemetry normalization edge cases (#34) * Fix telemetry follow-up edge cases * Fix rollback payload and Codex prompt attribution * Tighten Codex rollout prompt tracking * Update npm package metadata (#35) * Prepare 0.2.1 release (#36) * Prepare 0.2.1 release * Update README install path * Use trusted publishing for npm * feat: harden LLM calls and fix test failures (#38) * feat: consume @selftune/telemetry-contract as workspace package Replace relative path imports of telemetry-contract with the published @selftune/telemetry-contract workspace package. Adds workspace config to package.json and expands tsconfig includes to cover packages/*. Closes SEL-10 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(telemetry-contract): add versioning, metadata, and golden fixtures Add version 1.0.0 and package metadata (description, author, license, repository) to the telemetry-contract package. Create golden fixture file with one valid example per record kind and a test suite that validates all fixtures against the contract validator. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Make selftune source-truth driven * Harden live dashboard loading * Audit cleanup: test split, docs, lint fixes - Add make test-fast / test-slow targets (5s vs 80s, 16x faster dev loop) - Add bun run test:fast / test:slow scripts in package.json - Reposition README as "Claude Code first", update competitive comparison - Bump PRD.md version to 0.2.1 - Add CHANGELOG unreleased section (source-truth, telemetry-contract, test split) - Fix pre-existing lint: types.ts formatting, golden.test.ts import order Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add sync flags and hook dispatch to integration guide - Document all selftune sync flags (--since, --dry-run, --force, etc.) - Add selftune hook dispatch command with all 6 hook names - Verified init, activation rules, and source-truth sections already current Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Harden LLM calls and fix pre-existing test failures Add exponential backoff retry to callViaAgent for transient subprocess failures. Cap JSONL health-check validation at 500 lines to prevent timeouts on large log files. Use exported DEFAULT_WINDOW_SESSIONS constant in dashboard data collection instead of telemetry.length. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * Add local SQLite materialization layer for dashboard (#42) * feat: add SQLite materialization layer for dashboard queries Add a local SQLite database (via bun:sqlite) as an indexed materialized view store so the dashboard/report UX no longer depends on recomputing everything from raw JSONL logs on every request. 
New module at cli/selftune/localdb/ with: - schema.ts: 10 tables + 19 indexes mirroring canonical telemetry and local log shapes - db.ts: openDb() lifecycle with WAL mode, meta key-value helpers - materialize.ts: full rebuild and incremental materialization from JSONL source-of-truth logs - queries.ts: getOverviewPayload(), getSkillReportPayload(), getSkillsList() query helpers Raw JSONL logs remain authoritative — the DB is a disposable cache that can always be rebuilt. No new npm dependencies (bun:sqlite only). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: resolve Biome lint and format errors Auto-fix import ordering, formatting, and replace non-null assertions with optional chaining in tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * feat: implement autonomous selftune orchestrator core loop (#40) * feat: add selftune orchestrate command for autonomous core loop Introduces `selftune orchestrate` — a single entry point that chains sync → status → evolve → watch into one coordinated run. Defaults to dry-run mode with explicit --auto-approve for deployments. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address PR review — lint errors, logic bug, and type completeness - Replace string concatenation with template literals (Biome lint) - Add guard in evolve loop for agent-missing skip mutations - Replace non-null assertion with `as string` cast - Remove unused EvolutionAuditEntry import - Complete DoctorResult mock with required fields Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * style: apply Biome formatting and import sorting Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * Build(deps): Bump oven-sh/setup-bun from 2.1.2 to 2.1.3 (#26) Bumps [oven-sh/setup-bun](https://github.com/oven-sh/setup-bun) from 2.1.2 to 2.1.3. 
- [Release notes](https://github.com/oven-sh/setup-bun/releases) - [Commits](oven-sh/setup-bun@3d26778...ecf28dd) --- updated-dependencies: - dependency-name: oven-sh/setup-bun dependency-version: 2.1.3 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Build(deps): Bump actions/setup-node from 6.2.0 to 6.3.0 (#27) Bumps [actions/setup-node](https://github.com/actions/setup-node) from 6.2.0 to 6.3.0. - [Release notes](https://github.com/actions/setup-node/releases) - [Commits](actions/setup-node@6044e13...53b8394) --- updated-dependencies: - dependency-name: actions/setup-node dependency-version: 6.3.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Build(deps): Bump github/codeql-action from 4.32.4 to 4.32.6 (#28) Bumps [github/codeql-action](https://github.com/github/codeql-action) from 4.32.4 to 4.32.6. - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](github/codeql-action@89a39a4...0d579ff) --- updated-dependencies: - dependency-name: github/codeql-action dependency-version: 4.32.6 dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Improve sync progress and tighten query filtering (#43) * Improve sync progress and tighten query filtering * Fix biome formatting errors in sync.ts and query-filter.test.ts Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * feat: generic scheduling and reposition OpenClaw cron as optional (#41) * feat: add generic scheduling command and reposition OpenClaw cron as optional The primary automation story is now agent-agnostic. `selftune schedule` generates ready-to-use snippets for system cron, macOS launchd, and Linux systemd timers. `selftune cron` is repositioned as an optional OpenClaw integration rather than the main automation path. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address CodeRabbit review — centralize schedule data, fix generators and formatting Derive SCHEDULE_ENTRIES from DEFAULT_CRON_JOBS (single source of truth), generate launchd/systemd configs for all 4 entries instead of sync-only, fix biome formatting, and add markdown language tag. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: use StartCalendarInterval for fixed-time launchd and shell wrappers for chained commands - launchd: use StartCalendarInterval (Hour/Minute/Weekday) for fixed-time schedules instead of approximating with StartInterval - launchd/systemd: use /bin/sh -c wrapper for commands with && chains so prerequisite steps (like sync) are not silently dropped Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * feat: add local dashboard SPA with React + Vite (#39) * feat: add local dashboard SPA with React + Vite Introduces a minimal React SPA at apps/local-dashboard/ with two routes: overview (KPIs, skill health grid, evolution feed) and per-skill drilldown (pass rate, invocation breakdown, evaluation records). Consumes existing dashboard-server API endpoints with SSE live updates, explicit loading/ error/empty states, and design tokens matching the current dashboard. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address CodeRabbit review feedback for local dashboard Extract shared utils (deriveStatus, formatRate, timeAgo), add SSE exponential backoff with max retries, filter ungraded skills from avg pass rate, fix stuck loading state for undefined skillName, use word-boundary regex for evolution filtering, add focus-visible styles, add typecheck script, and add Vite env types. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address second round CodeRabbit review feedback Cancel pending SSE reconnect timers on cleanup, add stale-request guard to useSkillReport, remove redundant decodeURIComponent (React Router already decodes), quote font names in CSS for stylelint, and format deriveStatus signature for Biome. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: align local dashboard SPA with SQLite v2 data architecture Migrate SPA from old JSONL-reading /api/data endpoints to new SQLite-backed /api/v2/* endpoints. Add v2 server routes for overview and per-skill reports. Replace SSE with 15s polling. Rewrite types to match materialized query shapes from queries.ts. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address CodeRabbit review feedback on local dashboard SPA - Add language identifier to HANDOFF.md fenced code block (MD040) - Prevent overlapping polls in useOverview with in-flight guard and sequential setTimeout - Broaden empty-state check in useSkillReport to include evolution/proposals - Fix Sessions KPI to use counts.sessions instead of counts.telemetry - Wrap materializeIncremental in try/catch to preserve last good snapshot on failure Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: sort imports to satisfy Biome organizeImports lint Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address CodeRabbit nitpicks — cross-platform dev script, stricter types, CSS compat - Use concurrently for cross-platform dev script instead of shell backgrounding - Tighten Sidebar counts prop to Partial<Record<SkillHealthStatus, number>> - Replace color-mix() with rgba fallback for broader browser support Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: add UNKNOWN status filter and extract header height CSS variable - Add UNKNOWN to STATUS_OPTIONS so all SkillHealthStatus values are filterable - Extract hardcoded 56px header height to --header-h CSS variable Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: hoist sidebar collapse state to layout and add UNKNOWN filter style - Lift collapsed state from Sidebar to Overview so grid columns resize properly - Add .sidebar-collapsed grid rules at all breakpoints - Fix mobile: collapsed sidebar no longer creates dead-end (shows inline) - Add 
.filter-pill.active.filter-unknown CSS rule for UNKNOWN status Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: serve SPA as default dashboard, legacy at /legacy/ - Dashboard server now serves built SPA from apps/local-dashboard/dist/ at / - Legacy dashboard moved to /legacy/ route - SPA fallback for client-side routes (e.g. /skills/:name) - Static asset serving with content-hashed caching for /assets/* - Path traversal protection on static file serving - Add build:dashboard script to root package.json - Include apps/local-dashboard/dist/ in published files - Falls back to legacy dashboard if SPA build not found Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: add shadcn theming with dark/light toggle and selftune branding Migrate dashboard to shadcn theme system with proper light/dark support. Dark mode uses selftune site colors (navy/cream/copper), light mode uses standard shadcn defaults. Add ThemeProvider with localStorage persistence, sun/moon toggle in site header, and SVG logo with currentColor for both themes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: path traversal check and 404 for missing skills Use path.relative() + isAbsolute() instead of startsWith() for the SPA static asset path check to prevent directory traversal bypass. Return 404 from /api/v2/skills/:name when the skill has no usage data. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: biome formatting — semicolons, import order, line length Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address CodeRabbit review — dedupe polling, fix stale closures, harden theme/config - Lift useOverview to DashboardShell, pass as prop to Overview (no double polling) - Fix stale closure in drag handler by deriving indices from prev state - Validate localStorage theme values, use undefined context default - Add relative positioning to theme toggle button for MoonIcon overlay - Fix falsy check hiding zero values in chart tooltip - Fix invalid Tailwind selectors in dropdown-menu and toggle-group - Use ESM-safe fileURLToPath instead of __dirname in vite.config - Switch manualChunks to function form for Base UI subpath matching - Align pass-rate threshold with deriveStatus in SkillReport - Use local theme provider in sonner instead of next-themes - Add missing React import in skeleton, remove unused Separator import - Include vite.config.ts in tsconfig for typecheck coverage - Fix inconsistent JSX formatting in select scroll buttons Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address CodeRabbit review round 2 — shared sorting, DnD fixes, Tailwind v4 migration - Extract sortByPassRateAndChecks to utils.ts, dedupe sorting in App + Overview - Derive DnD dataIds from row model (not raw data), guard against -1 indexOf - Hide pagination when table is empty instead of showing "Page 1 of 0" - Fix ActivityTimeline default tab to prefer non-empty dataset - Import ReactNode directly instead of undeclared React namespace - Quote CSS attribute selector in chart style injection - Use stable composite keys for tooltip and legend items - Remove unnecessary "use client" directive from dropdown-menu (Vite SPA) - Migrate outline-none to outline-hidden for Tailwind v4 accessibility - Fix toggle-group orientation selectors to match data-orientation attribute - Add missing CSSProperties 
import in sonner.tsx - Add dark mode variant for SkillReport row highlight - Format vite.config.ts with Biome Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: add evidence viewer, evolution timeline, and enhanced skill report Add EvidenceViewer, EvolutionTimeline, and InfoTip components. Enhance SkillReport with richer data display, expand dashboard server API endpoints, and update documentation and architecture docs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address CodeRabbit review round 3 — DnD/sort conflict, theme listener, formatting - Disable DnD reorder when table sorting is active (skill-health-grid) - Listen for OS theme preference changes when system theme is active - Apply Biome formatting to sortByPassRateAndChecks - Remove unused useEffect import from Overview - Deduplicate confidence filter in SkillReport - Materialize session IDs once in dashboard-server to avoid repeated subqueries Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: show selftune version in sidebar footer Pass version from API response through to AppSidebar and display it dynamically instead of hardcoded "dashboard v0.1". Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: biome formatting in dashboard-server — line length wrapping Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address CodeRabbit review round 4 — dedupe formatRate, STATUS_CONFIG, cleanup - Remove duplicate formatRate from app-sidebar, import from @/utils - Extract STATUS_CONFIG to shared @/constants module, import in both skill-health-grid and SkillReport - Remove misleading '' fallback from sessionPlaceholders since the ternary guards already skip queries when empty Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: remove redundant items prop from Select to avoid duplication The SelectItem children already define the options; the items prop was duplicating them unnecessarily. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: add sortableKeyboardCoordinates to KeyboardSensor for proper keyboard DnD Without this, keyboard navigation moves by pixels instead of jumping between sortable items. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: Linear-style dashboard UX — collapsible sidebar, direct skill links, scope grouping - Simplify sidebar: remove status filters, keep logo + search + skills list - Add collapsible scope groups (Project/Global) using base-ui Collapsible - Surface skill_scope from DB query through API to dashboard types - Replace skill drawer with direct Link navigation to skill report - Add Scope column to skills table with filter dropdown - Slim down site header: remove breadcrumbs, reduce to sidebar trigger + theme toggle - Add side-by-side grid layout: skills table left, activity panel right - Gitignore pnpm-lock.yaml alongside bun.lock Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address CodeRabbit review — accessibility, semantics, state reset - Remove bun.lock from .gitignore to maintain build reproducibility - Preserve unexpected scope values in sidebar (don't drop unrecognized scopes) - Add aria-label to skill search input for screen reader accessibility - Switch status filter from checkbox to radio-group semantics (mutually exclusive) - Reset selectedProposal when navigating between skills via useEffect on name Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: add TanStack Query and optimize SQL queries for dashboard performance Migrate data fetching from manual polling/dedup hooks to TanStack Query for instant cached navigation, background refetch, and request dedup. Optimize SQL: replace NOT IN subqueries with LEFT JOIN, move JS dedup to GROUP BY, add LIMIT 200 to unbounded evidence queries. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * chore: track root bun.lock for reproducible installs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address CodeRabbit review — collapsible sync, drag handle dedup, a11y, not-found heuristic - Make sidebar Collapsible controlled so it auto-opens when active skill changes (Comment #1) - Consolidate useSortable to single call per row via React context, use setActivatorNodeRef on drag handle button (Comment #2) - Remove capitalize CSS transform on free-form scope values (Comment #3) - Broaden isNotFound heuristic to check invocations, prompts, sessions in addition to evals/evolution/proposals (Comment #4) - Move Tooltip outside TabsTrigger to avoid nested interactive elements, use Base UI render prop for composition (Comment #5) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address CodeRabbit nitpicks — version pinning, changelog clarity, shared query helper - Use caret range for recharts version (^2.15.4) for consistency - Clarify changelog: SSE was removed, polling via refetchInterval is primary - Extract getPendingProposals() shared helper in queries.ts, used by both getOverviewPayload() and dashboard-server skill report endpoint Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address CodeRabbit round 3 — deps, async fs, type safety, deterministic query - Move @tailwindcss/vite, tailwindcss, shadcn to devDependencies - Fix trailing space in version display when version is empty - Type caught error as unknown in refreshV2Data - Replace sync fs (readFileSync/statSync) with Bun.file() for hot-path asset serving - Return 404 for missing /assets/* files instead of falling through to SPA - Add details and eval_set fields to SkillReportPayload.evidence type - Fix nondeterministic GROUP BY with ROW_NUMBER() CTE in getPendingProposals Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: resolve Biome lint and format errors in CI - Replace non-null assertion 
with type cast in useSkillReport (noNonNullAssertion) - Break long import line in dashboard-server.ts to satisfy Biome formatter Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address CodeRabbit round 4 — CTE subqueries, type alignment, scope index - Replace dynamic bind-parameter expansion with CTE subquery for session lookups - Add skill_name to OverviewPayload.pending_proposals type to match runtime shape - Add composite index on skill_usage(skill_name, skill_scope, timestamp) for scope lookups Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address CodeRabbit round 5 — startup guard, 404 heuristic, deterministic tiebreaker - Guard initial v2 materialization with try/catch to avoid full server crash - Include evidence in not-found check so evidence-only skills aren't 404'd - Add ea.id DESC tiebreaker to ROW_NUMBER() for deterministic pending proposals Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address CodeRabbit round 6 — db guard, refresh throttle, deferred 404 - Guard openDb() in try/catch so DB bootstrap failure doesn't crash server - Make db nullable, return 503 from /api/v2/* when store is unavailable - Throttle failed refresh attempts with separate lastV2RefreshAttemptAt timestamp - Move skill 404 check after enrichment queries (evolution, proposals, invocations) - Use optional chaining for db.close() on shutdown Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * Prepare SPA dashboard release path (#44) * Promote product planning docs * Add execution plans for product gaps and evals * Prepare SPA dashboard release path * Remove legacy dashboard runtime * Refresh execution plans after dashboard cutover * Build dashboard SPA in CI and publish * Refresh README for SPA release path * Address dashboard release review comments * Fix biome lint errors in dashboard tests Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Make 
autonomous loop the default scheduler path * Document orchestrate as the autonomous loop * Document autonomy-first setup path * Harden autonomous scheduler install paths * Clarify sync force usage in README --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * feat: phased decision report for orchestrator explainability (#48) * feat: add phased decision report to orchestrator Orchestrate output now explains each decision clearly so users can trust the autonomous loop. Adds formatOrchestrateReport() with 5-phase human report (sync, status, decisions, evolution, watch) and enriched JSON with per-skill decisions array. Supersedes PR #45. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: update orchestrate workflow docs and changelog for decision report Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: remove redundant null check in formatEvolutionPhase The filter already guarantees evolveResult is defined; use non-null assertion instead of a runtime guard. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: add defensive optional chaining for watch snapshot in JSON output Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: avoid empty parentheses in watch report when snapshot missing Consolidates pass_rate and baseline into a single conditional metrics suffix so lines without a snapshot render cleanly. Addresses CodeRabbit review feedback. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve biome lint and format errors Replace non-null assertion (!) with type-safe cast to satisfy noNonNullAssertion rule, and collapse single-arg lines.push to one line per biome formatter. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * feat: evidence-based candidate selection gating (#50) * feat: evidence-based candidate selection with cooldown, evidence, and trend gates Add four gating rules to selectCandidates so autonomous evolution acts on stronger signals and skips noisy/premature candidates: - Cooldown gate: skip skills deployed within 24h - Evidence gate: require 3+ skill_checks for CRITICAL/WARNING - Weak-signal filter: skip WARNING with 0 missed queries + non-declining trend - Trend boost: declining skills prioritized higher in sort order Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: use epoch-ms comparison for timestamp gating, use cooldown constant in test Fixes two CodeRabbit review issues: - Timestamp comparisons in findRecentlyDeployedSkills and findRecentlyEvolvedSkills now use Date.parse + epoch-ms instead of lexicographic string comparison, which breaks on non-UTC offsets - Test derives oldTimestamp from DEFAULT_COOLDOWN_HOURS instead of hardcoding 48, fixing the unused import lint error Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * style: fix import formatting in orchestrate test to satisfy CI Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * style: apply biome formatting to orchestrate and tests Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * E2E autonomy proof harness for evolution pipeline (#51) * feat: add e2e autonomy proof harness for evolution pipeline Proves three core autonomous evolution claims with 8 deterministic tests: - Autonomous deploy: orchestrate selects WARNING skill, evolve deploys real SKILL.md - Regression detection: watch fires alert when pass rate drops below baseline - Auto-rollback: deploy→regression→rollback restores original file from backup Uses dependency injection to skip LLM calls while exercising real 
file I/O (deployProposal writes, rollback restores, audit trail persists). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve Biome lint errors in autonomy-proof test Sort imports, fix formatting, remove unused imports, replace non-null assertions with optional chaining. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: persist orchestrate run reports in dashboard (#49) * feat: persist orchestrate run reports and expose in dashboard SPA Orchestrate now writes a structured run report (JSONL) after each run, materialized into SQLite for the dashboard. A new "Orchestrate Runs" panel on the Overview page lets users inspect what selftune did, why skills were selected/skipped/deployed, and review autonomous decisions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address CodeRabbit review findings - Handle rolledBack state in OrchestrateRunsPanel badge rendering - Show loading/error states instead of false empty state for orchestrate runs - Move ORCHESTRATE_RUN_LOG to LOG_DIR (~/.claude) per log-path convention - Validate limit param with 400 error for non-numeric input - Derive run report counts from final candidates instead of stale summary - Include error message in appendJsonl catch for diagnosability Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: update autonomy-proof test fixtures for new candidate selection gates After merging dev, selectCandidates gained cooldown, evidence, and weak-signal gates. The test fixtures used snapshot: null and trend: "declining", which caused skills to be skipped by the insufficient-evidence gate and missed the renamed trend value "down". 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: unify summary totals to prevent CLI/dashboard metric drift Both result.summary and runReport now derive from a single finalTotals object computed from the final candidates array, eliminating the possibility of divergent counts between CLI output and persisted dashboard data. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: demo-ready CLI consolidation + autonomous orchestration (#52) * Refresh architecture and operator docs * docs: align docs and skill workflows with autonomy-first operator path Reframes operator guide around autonomy-first setup, adds orchestrate runs endpoint to architecture/dashboard docs, and updates skill workflows to recommend --enable-autonomy as the default initialization path. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: autoresearch-inspired UX improvements for demo readiness - orchestrate --loop: continuous autonomous improvement cycle with configurable interval - evolve: default cheap-loop on, add --full-model escape hatch, show diff after deploy - bare `selftune` shows status dashboard instead of help text Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: show selftune resource usage instead of session-level metrics in skill report Skill report cards now display selftune's own LLM calls and evolution duration per skill (from orchestrate_runs) instead of misleading session-level token/duration aggregates. Also extracts tokens and duration from transcripts into canonical execution facts for future use. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: consolidate CLI from 28 flat commands to 21 grouped commands Group 15 related commands under 4 parent commands: - selftune ingest <agent> (claude, codex, opencode, openclaw, wrap-codex) - selftune grade [mode] (auto, baseline) - selftune evolve [target] (body, rollback) - selftune eval <action> (generate, unit-test, import, composability) Update all 39 files: router, subcommand help text, SKILL.md, workflow docs, design docs, README, PRD, CHANGELOG, and agent configs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address CodeRabbit PR review comments - Filter skip/watch actions from selftune_stats run counts - Restore legacy token_usage/duration_stats from execution_facts - Cooperative SIGINT/SIGTERM shutdown for orchestrate loop - Validate --window as positive integer with error message - Add process.exit guard for bare selftune status fallthrough - Update ARCHITECTURE.md import matrix for Dashboard dependencies - Fix adapter count, code fence languages, and doc terminology Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address remaining review comments and stale replay references - Escape SQL LIKE wildcards in dashboard skill name query - Add Audit + Rollback steps to SKILL.md feedback loop - Fix stale "replay" references in quickstart help text and quickstart.ts Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve CI lint failures - Fix dashboard-server.ts indentation on LIKE escape pattern - Prefix unused deployedCount/watchedCount with underscore - Format api.ts import to multi-line per biome rules Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: escape backslash in SQL LIKE pattern to satisfy CodeQL Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add system status page to dashboard with doctor diagnostics Surfaces the doctor health 
checks (config, log files, hooks, evolution) through a new /status route in the dashboard SPA, so humans can monitor selftune health without touching the CLI. - Add GET /api/v2/doctor endpoint to dashboard server - Add DoctorResult/HealthCheck types to dashboard contract - Create Status page with grouped checks, summary cards, auto-refresh - Add System Status link in sidebar footer - Update all related docs (ARCHITECTURE, HANDOFF, system-overview, Dashboard workflow) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: add ESCAPE clause to LIKE query and fix stale replay label - SQLite LIKE needs explicit ESCAPE '\\' for backslash escapes to work - Rename "Replay failed" to "Ingest failed" in quickstart error output Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address CodeRabbit review comments on status page PR - Add "Other" fallback group for unknown check types in Status page - Use compound key (name+idx) to avoid React key collisions - Re-export DoctorResult types from types.ts instead of duplicating - Fix orchestrate loop sleep deadlock on SIGINT/SIGTERM - Replace stale SSE references with polling-based refresh in Dashboard docs Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: establish agent-first architecture principle across repo selftune is a skill consumed by agents, not a CLI tool for humans. Users install the skill and talk to their agent ("improve my skills"), the agent reads SKILL.md, routes to workflows, and runs CLI commands. - AGENTS.md: add Agent-First Architecture section + dev guidance - ARCHITECTURE.md: add Agent-First Design Principle at top - SKILL.md: add agent-addressing preamble ("You are the operator") Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: demo-ready P0 fixes from architecture audit Four parallel agent implementations: 1. 
SKILL.md trigger keywords: added natural-language triggers across 10 workflows + 13 new user-facing examples ("set up selftune", "improve my skills", "how are my skills doing", etc.) 2. Hook auto-merge: selftune init now automatically merges hooks into ~/.claude/settings.json for Claude Code — no manual settings editing. Initialize.md updated to reflect auto-install. 3. Cold-start fallback: quickstart detects empty telemetry after ingest and shows hook-discovered skills or guidance message instead of blank output. No LLM calls, purely data-driven. 4. Dashboard build: added prepublishOnly script to ensure SPA is built before npm publish (CI already did this, but local publish was not). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: defensive checks fallback and clarify reserved counters - Status.tsx: default checks to [] if API returns undefined - orchestrate.ts: annotate _deployedCount/_watchedCount as reserved Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: prioritize Claude Code, unify cron/schedule, remove dead code - Mark Codex/OpenCode/OpenClaw as experimental across docs, SKILL.md, CLI help text, and README. Claude Code is the primary platform. - Unify cron and schedule into `selftune cron` with --platform flag for agent-specific setup. `selftune schedule` kept as alias. - Remove dead _deployedCount/_watchedCount counters from orchestrate.ts (summary already computed via array filters in Step 7). 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: document two operating modes and data architecture - ARCHITECTURE.md: add Interactive vs Automated mode explanation, document JSONL-first data flow with SQLite as materialized view - Cron.md: fix stale orchestrate schedule (weekly → every 6 hours), correct "agent runs" to "OS scheduler calls CLI directly" - Orchestrate.md: add execution context table (interactive vs automated) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: rewrite all 22 workflow docs for agent-first consistency Three parallel agents rewrote workflow docs so the agent (not the human) is the operator: Critical (3 files): Evolve.md, Evals.md, Baseline.md - Pre-flight sections now have explicit selection-to-flag mapping tables - Agent knows exactly how to parse user choices into CLI commands Moderate (11 files): Initialize, Dashboard, Watch, Grade, Contribute, UnitTest, Sync, AutoActivation, Orchestrate, Doctor, Replay - "When to Use" sections rewritten as agent trigger conditions - "Common Patterns" converted from user quotes to agent decision logic - Steps use imperative agent voice throughout - Replay.md renamed to "Ingest (Claude) Workflow" with compatibility note Minor (8 files): Composability, Schedule, Cron, Badge, Workflows, EvolutionMemory, ImportSkillsBench, Ingest - Added missing "When to Use" sections - Added error handling guidance - Fixed agent voice consistency Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add autonomous mode + connect agents to workflows Autonomous mode: - Evolve, Watch, Grade, Orchestrate workflows now document their behavior when called by selftune orchestrate (no user interaction, defaults used, pre-flight skipped, auto-rollback enabled) - SKILL.md routing table marks autonomous workflows with † Agent connections: - All 4 agents (.claude/agents/) now have "Connection to Workflows" sections explaining when the main agent should 
spawn them - Key workflows (Evolve, Doctor, Composability, Initialize) now have "Subagent Escalation" sections referencing the relevant agent - SKILL.md agents table adds "When to spawn" column with triggers Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: remove duplicate findRecentlyEvolvedSkills function findRecentlyEvolvedSkills was identical to findRecentlyDeployedSkills. Consolidated into one function used for both cooldown gating and watch targeting. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address 21 CodeRabbit review comments AGENTS.md: - Add missing hook files and openclaw-ingest to project tree - Use selftune cron as canonical scheduling command Status.tsx: - Add aria-label and title to refresh button for accessibility ARCHITECTURE.md: - Use canonical JSONL filenames matching constants.ts - Add text language specifier to code block index.ts: - Add --help handlers for grade and evolve grouped commands - Add --help handler for eval composability before parseArgs quickstart.ts: - Fix stale "Replay" comment to "Ingest" Workflow docs: - Cron.md: fix --format to --platform, add text fence, add --skill-path - Evals.md: fix HTML entities to literal angle brackets - Evolve.md: replace placeholder with actual --pareto flag - Grade.md: clarify results come from grading.json not stdout - Ingest.md: fix wrap-codex error guidance (no --verbose flag) - Initialize.md: use full selftune command form, fix relative path - Orchestrate.md: fix token cost contradiction, document --loop mode - Sync.md: clarify synced=0 is valid, fix output parsing guidance Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: real-time improvement signal detection and reactive orchestration Core feature: selftune now detects when users correct skill misses ("why didn't you use X?", "please use the commit skill") and triggers focused improvement automatically when the session ends. 
Signal detection (prompt-log.ts): - Pure regex patterns detect corrections, explicit requests - Extracts mentioned skill name from query text - Appends to improvement_signals.jsonl (zero LLM cost) Reactive trigger (session-stop.ts): - Checks for pending signals when session ends - Spawns background selftune orchestrate if signals exist - Lockfile prevents concurrent runs (30-min stale threshold) Signal-aware orchestrator (orchestrate.ts): - Reads pending signals at startup (no new CLI flags) - Boosts priority of signaled skills (+150 per signal, cap 450) - Signaled skills bypass evidence and UNGRADED gates - Marks signals consumed after run completes - Lockfile acquire/release wrapping full orchestrate body Tests: 32 new tests across 2 files (signal detection + orchestrator) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: document signal-reactive improvement across architecture docs - ARCHITECTURE.md: add Signal-Reactive Improvement section with mermaid sequence diagram showing signal flow from prompt-log to orchestrate - Orchestrate.md: add Signal-Reactive Trigger section with guard rails - evolution-pipeline.md: add signal detection as pipeline input - system-overview.md: add signal-reactive path to system overview - logs.md: document improvement_signals.jsonl format and fields Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address 14 CodeRabbit review comments (round 4) Agents: - diagnosis-analyst: resolve activation contradiction (subagent-only) - evolution-reviewer: inspect recorded eval file, not regenerated one - integration-guide: carry --skill flag through to evolve command Dashboard: - Status.tsx: defensive fallback for unknown health status values CLI: - index.ts: remove redundant process.exit(0) after statusMain - index.ts: strict regex validation for --window (reject "10days") Quickstart: - Remove misleading [2/3] prefix from post-step check Workflows: - SKILL.md: add text language specifier to 
feedback loop diagram - Initialize.md: add blank line + text specifier to code block - Orchestrate.md: fix sync step to use selftune sync not ingest claude - Doctor.md: route missing-telemetry fixes by agent platform - Evals.md: add skill-path to synthetic pre-flight, note haiku alias - Ingest.md: wrap-codex uses wrapper not hooks for telemetry Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: dependency map, README refresh, dashboard signal exec plan 1. AGENTS.md: Add Change Propagation Map — "if you change X, update Y" table that agents check before committing. Prevents stale docs. 2. README.md: Refresh for v0.2 architecture: - Agent-first framing ("tell your agent" not "run this command") - Grouped commands table (ingest, grade, evolve, eval, auto) - Signal-reactive detection mentioned in Detect section - Automate section with selftune cron setup - Removed CLI-centric use case descriptions 3. Exec plan for dashboard signal integration (planned, not started): - Schema + materialization + queries + contract + API + UI - 3 parallel agent workstreams, ~4.5 hours estimated Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: migrate repo URLs from WellDunDun to selftune-dev org Update all repo path references (badges, clone URLs, install command, contribute PR target, security tab link, llms.txt) from personal WellDunDun/selftune to org selftune-dev/selftune. 
Kept as WellDunDun (personal account, not repo path): - CODEOWNERS (@WellDunDun) - FUNDING.yml (sponsors/WellDunDun) - LICENSE copyright - PRD owner field Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add repo org/name migration to change propagation map Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address 13 CodeRabbit review comments (round 5) Code fixes: - prompt-log.ts: broaden skill capture regex to [\w-]+ for hyphenated names - session-stop.ts: atomic lock acquisition with openSync("wx") + cleanup - orchestrate.ts: re-read signal log before write to prevent race condition Dashboard: - Status.tsx: defensive defaults for healthy, summary, timestamp Docs: - ARCHITECTURE.md: use selftune cron setup as canonical scheduler - Orchestrate.md: fix loop-interval default (3600s not 300s) - Evals.md: fix option numbering (1-5) and selection mapping (4a/4b/4c) - Initialize.md: use selftune init --force instead of repo-relative paths - logs.md: document signal consumption as exception to append-only Tests: - signal-detection: fix vacuous unknown-skill test - signal-orchestrate: exercise missing-log branch with non-empty signals Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: llms.txt branch-agnostic links, README experimental clarity - llms.txt: /blob/master/ → /blob/HEAD/ for branch-agnostic URLs - README line 28: clarify Claude Code primary, others experimental - README line 38: remove redundant "Within minutes" - README footer: match experimental language from Platforms section Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
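The timestamp fix in #50 above replaces lexicographic string comparison with `Date.parse` and epoch-ms comparison. A minimal sketch of why that matters follows. The helper name `isWithinCooldown`, the sample timestamps, and the handling of unparseable input are assumptions for illustration, not the project's code; only the 24h cooldown and the `Date.parse` approach come from the commit log.

```typescript
// Hypothetical illustration; not the project's actual implementation.
const HOUR_MS = 60 * 60 * 1000;

/** True when `deployedAt` falls inside the cooldown window ending at `now`. */
function isWithinCooldown(deployedAt: string, now: Date, cooldownHours = 24): boolean {
  const deployedMs = Date.parse(deployedAt);
  // Assumption: unparseable timestamps are treated as "not recently deployed".
  if (Number.isNaN(deployedMs)) return false;
  const ageMs = now.getTime() - deployedMs;
  return ageMs >= 0 && ageMs < cooldownHours * HOUR_MS;
}

// Two ISO-8601 timestamps with different UTC offsets.
const withOffset = "2026-03-14T01:00:00+02:00"; // same instant as 2026-03-13T23:00:00Z
const utc = "2026-03-13T23:30:00Z";

// String comparison orders by the written calendar date and gets this pair wrong.
const lexicographicSaysLater = withOffset > utc;
// Epoch-ms comparison orders by the actual instant.
const epochSaysLater = Date.parse(withOffset) > Date.parse(utc);

console.log({ lexicographicSaysLater, epochSaysLater });
```

Here the string comparison reports `withOffset` as later, while the epoch-ms comparison correctly reports it as earlier.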
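The LIKE-related fixes in #52 above (escape wildcards, escape the backslash, add an explicit `ESCAPE` clause) fit together roughly as sketched below. The `escapeLike` helper and the query text are hypothetical; the actual query in dashboard-server.ts is not shown on this page and may differ.

```typescript
// Hypothetical helper; the real query in dashboard-server.ts may differ.

/** Escape LIKE metacharacters so user input is matched literally. */
function escapeLike(input: string): string {
  // The backslash is escaped too, because it serves as the escape character.
  return input.replace(/[\\%_]/g, (ch) => "\\" + ch);
}

// The query shape is an assumption for illustration.
// SQLite only honors backslash escapes when ESCAPE is declared explicitly.
const sql = "SELECT skill_name FROM skill_usage WHERE skill_name LIKE ? ESCAPE '\\'";

// Bind the escaped value as a parameter; never interpolate it into the SQL text.
const pattern = `%${escapeLike("my_skill")}%`;

console.log(sql);
console.log(pattern);
```

Without the escaping step, the underscore in a name such as `my_skill` would match any single character, so a lookup could return rows for other skills.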
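The signal-aware priority boost described above (+150 per signal, capped at 450) could look roughly like this. All names here are hypothetical, and treating the boost as an additive term on a numeric base priority is an assumption; the commit log gives only the per-signal amount and the cap.

```typescript
// Constants mirror the commit message; all names are hypothetical.
const BOOST_PER_SIGNAL = 150;
const MAX_SIGNAL_BOOST = 450;

interface Candidate {
  skill: string;
  basePriority: number;
}

/** Priority bonus for a skill with `signalCount` pending improvement signals. */
function signalBoost(signalCount: number): number {
  const count = Math.max(0, Math.floor(signalCount));
  return Math.min(count * BOOST_PER_SIGNAL, MAX_SIGNAL_BOOST);
}

/** Sort candidates by base priority plus signal boost, highest first. */
function rankCandidates(candidates: Candidate[], signals: Map<string, number>): Candidate[] {
  const score = (c: Candidate): number => c.basePriority + signalBoost(signals.get(c.skill) ?? 0);
  // Copy before sorting so the caller's array is left untouched.
  return [...candidates].sort((a, b) => score(b) - score(a));
}

const ranked = rankCandidates(
  [
    { skill: "commit", basePriority: 100 },
    { skill: "review", basePriority: 300 },
  ],
  new Map([["commit", 2]]),
);

console.log(ranked.map((c) => c.skill));
```

The cap keeps a burst of corrections for one skill from permanently outranking every other candidate.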

coderabbitai bot reviewed Mar 14, 2026

WellDunDun mentioned this pull request Mar 14, 2026

feat: phased decision report for orchestrator explainability #48

Merged

4 tasks

WellDunDun closed this Mar 15, 2026

feat: add phased decision report to orchestrator #45
WellDunDun wants to merge 1 commit into dev from WellDunDun/orchestrator-explain

WellDunDun commented Mar 14, 2026

coderabbitai bot commented Mar 14, 2026 (edited)

❌ Failed checks (1 warning)

coderabbitai bot left a comment

coderabbitai bot Mar 14, 2026

WellDunDun commented Mar 15, 2026

1 participant
